Predicting Prevalence of Influenza-Like Illness From Geo-Tagged Tweets
Authors
Abstract
Modeling disease spread and distribution using social media data has become an increasingly popular research area. While Twitter data has recently been investigated for estimating disease spread, the extent to which it represents disease spread and distribution at the macro scale is still an open question. In this paper, we focus on macroscale modeling of influenza-like illnesses (ILI) using a large dataset of 8,961,932 tweets from Australia collected in 2015. We first propose modifications of the state-of-the-art ILI-related tweet detection approaches to acquire a more refined dataset. We then normalize the number of detected ILI-related tweets by the Internet access and Twitter penetration rates in each state. Finally, we establish a state-level linear regression model between the number of ILI-related tweets and the number of real influenza notifications. The Pearson correlation coefficient of the model is 0.93. Our results indicate that: 1) a strong positive linear correlation exists between the number of ILI-related tweets and the number of recorded influenza notifications at the state scale; 2) Twitter data shows promise for helping detect influenza outbreaks; 3) taking into account the population, Internet access, and Twitter penetration rates in each state enhances the prevalence modeling analysis.
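The normalization and state-level regression described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the normalization formula, so dividing the tweet count by the product of the two rates is an assumption, and the function names (`normalize`, `pearson`, `fit_line`) and all numbers in `HYPOTHETICAL_STATES` are invented placeholders rather than data from the paper.

```python
from math import sqrt


def normalize(tweet_count, internet_rate, twitter_rate):
    """Scale a raw ILI tweet count by the fraction of residents assumed
    to be on Twitter (Internet access rate times Twitter penetration rate).
    ASSUMPTION: the paper's exact normalization is not stated in the abstract."""
    return tweet_count / (internet_rate * twitter_rate)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# HYPOTHETICAL per-state records, for illustration only:
# (ILI tweet count, Internet access rate, Twitter penetration rate, notifications)
HYPOTHETICAL_STATES = {
    "State A": (1200, 0.85, 0.20, 9000),
    "State B": (800, 0.80, 0.18, 6500),
    "State C": (300, 0.75, 0.15, 3000),
    "State D": (150, 0.70, 0.12, 1800),
}

normalized = [normalize(t, i, p) for t, i, p, _ in HYPOTHETICAL_STATES.values()]
notifications = [n for _, _, _, n in HYPOTHETICAL_STATES.values()]

r = pearson(normalized, notifications)
slope, intercept = fit_line(normalized, notifications)
print(f"r = {r:.3f}, notifications ~ {slope:.4f} * tweets + {intercept:.1f}")
```

With real per-state counts and rates in place of the placeholders, `r` would be the quantity the abstract reports as the model's Pearson correlation coefficient.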
Similar resources
Identifying Data Noises, User Biases, and System Errors in Geo-tagged Twitter Messages (Tweets)
Many social media researchers and data scientists collected geotagged tweets to conduct spatial analysis or identify spatiotemporal patterns of filtered messages for specific topics or events. This paper provides a systematic view to illustrate the characteristics (data noises, user biases, and system errors) of geo-tagged tweets from the Twitter Streaming API. First, we found that a small perc...
When twitter meets foursquare: tweet location prediction using foursquare
The continued explosion of Twitter data has opened doors for many applications, such as location-based advertisement and entertainment using smartphones. Unfortunately, only about 0.58 percent of tweets are geo-tagged to date. To tackle the location sparseness problem, this paper presents a methodical approach to increasing the number of geotagged tweets by predicting the fine-grained location ...
Discover Patterns and Mobility of Twitter Users - A Study of Four US College Cities
Geo-tagged tweets provide useful implications for studies in human geography, urban science, location-based services, targeted advertising, and social network. This research aims to discover the patterns and mobility of Twitter users by analyzing the spatial and temporal dynamics in their tweets. Geo-tagged tweets are collected over a period of six months for four US Midwestern college cities: ...
Scaling laws in geo-located Twitter data
We observe and report on a systematic relationship between population density and Twitter use. Number of tweets, number of users and population per unit area are related by power laws, with exponents greater than one, that are consistent with each other and across a range of spatial scales. This implies that population density can accurately predict Twitter activity. Furthermore this trend can ...
Prevalence of influenza A/H3N2 virus in northern Iran from 2011 to 2013
Background: Influenza A virus is the most virulent human pathogen and causes the most serious problem. Having epidemiological knowledge about this disease is important. The aim of this study was to determine the prevalence of influenza A/H3N2 virus infection in northern Iran from 2011 to 2013 using the real-time polymerase chain reaction (RT-PCR). Methods: In this cross-sectional study...